Abstract: African primates remain an unexplored source of information required to complete the picture of the origin and evolution of 
many human pathogens. Current studies have shown the importance of several human receptor genes implicated in host 
resistance or susceptibility to tuberculosis. The validation of these genes in Mycobacterium tuberculosis infection makes 
them an excellent model system to investigate the mode of selective pressures that may act on pathogen defense genes. To 
trace the evolutionary history of these genes, this report describes preliminary results for eight human receptor genes having 
either a significant or a possible association with tuberculosis (TB). Using a combination of maximum likelihood 
approaches, evidence of positive selection was detected for four genes. The analysis between species, nevertheless, shows a 
clear pattern of nucleotide variation mostly compatible with purifying selection. 
1. Introduction 

Although the Mycobacterium tuberculosis complex 
infects a third of all humans, little is known regarding the 
prevalence of mycobacterial infection in non-human 
primates (NHP). For more than a century, tuberculosis has 
been regarded as a serious infectious threat to NHP species. 
As in humans, many mycobacteria are etiologic agents of 
disease in non-human primates, and the degree of 
susceptibility varies among primate genera [1-3]. 

Comparisons with genomes of closely related species are 
necessary in order to achieve a more complete 
understanding of the evolution of the human genome. 

Thus far, differential gene expression, gene loss, gene 
duplication, and natural selection have received the most 
attention as the molecular mechanisms responsible for the 
differences in protein activity in these species [4, 5]. 

Current studies have shown the importance of numerous 
human genes implicated in host resistance or susceptibility 
to tuberculosis (e.g., receptors, transporters, cytokines, 
chemokines, etc.) [6]. The validation of these genes in 
Mycobacterium tuberculosis infection makes them an 
excellent model system to investigate the mode of selective 
pressures that may act on pathogen defense genes. 



Frequent positive selection is a hallmark of genes 
involved in the adaptive immune system of vertebrates, but 
the prevalence of positive selection for genes underlying 
immunity in vertebrates has not been well studied. The 
identification of genes subjected to positive selection can 
lead to predictions of putative functionally important regions 
of genes [7]. Receptor genes of the innate immune system 
represent the first line of defense against pathogens. Of 
particular interest is the identification of human or 
primate-specific events of adaptive evolution. Humans 
diverged from chimpanzees, their closest relatives among 
the African apes, about 5-7 million years ago. Genetically, 
humans and chimpanzees share nearly 99% DNA sequence 
similarity [9-10], which seems to belie the marked biological 
and phenotypic divergence between them, e.g., the highly 
developed cognitive abilities in humans. The major 
challenge in the human/chimpanzee genome sequence era is 
to determine the small subset of sequence differences that 
have phenotypic significance related to species-specific 
traits [11]. 

Interspecific comparisons provide information about 
evolutionary processes acting over different timescales. The 
history of pathogenic diseases during primate evolution 
undoubtedly played a role in shaping the present immune 
system, and the forces acting on immune genes over this 
timescale can only be studied from interspecific 
comparisons. 

58 Barbara Picone and Alan Christoffels: Molecular Evolution of Key Receptor Genes in Primates and Non-Human Primates 

Current studies have shown the importance of numerous 
human genes implicated in host resistance or susceptibility 
to tuberculosis (e.g., VDR, P2RX7, NOS2A, SP110, CD209, 
CCL2, CXCL10, GC, IFNG, IL10, PTPN22, TIRAP, 
TLR2, TNFRSF1B). The aim of the present study is to extend 
the analysis of such genes to non-human primates in order to 
thoroughly assess their evolutionary history and to identify 
possible amino acids under selective pressure. This analysis 
focuses on a subset of human receptor genes 
having either a significant or a possible association with 
tuberculosis (VDR, P2RX7, TNFRSF1B, IL12RB1, CD209, 
MC3R, TIRAP, IFNGR1). Future studies will be 
needed to understand the evolution of other receptors. 

2. Material and Methods 

2.1. Sequences 

The sequences of the primate genes used in the analyses 
were retrieved from the public database “Ensembl Genome 
Browser”. For each gene, a subset of 4-11 species was used, 
which included species from the most representative primate 
groups (Old World primates, New World primates, and 
Prosimians). 

2.2. Codon-Based Analyses of Positive Selection 

To evaluate positive and negative selection acting on all the 
genes during primate evolution, we compared the rate per 
site of non-synonymous substitutions (dN) to the rate per site 
of synonymous substitutions (dS) in a maximum likelihood 
(ML) framework (M7/M8, SLAC, FEL, REL and SLR). A 
ratio of dN/dS >1 is interpreted as strong evidence of 
positive selection, whereas a dN/dS <1 is evidence of 
purifying selection. For each gene, a neighbor-joining tree 
was used as the working topology, which was constructed 
using MEGA v.5 with p-distance as the substitution 
model and complete deletion of gaps and missing data [12]. 
Two alternative CODEML models, M7 and M8, were then 
fitted (PAML v. 4) [13]. 
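
As an illustration of the distance setting described above, the following minimal Python sketch computes a p-distance after complete deletion of gapped or ambiguous alignment columns. It is not the MEGA implementation; the function names and the toy sequences are hypothetical and serve only to show the calculation.

```python
def complete_deletion(seqs):
    """Drop every alignment column with a gap or ambiguous base in any sequence."""
    valid = set("ACGT")
    keep = [i for i in range(len(seqs[0]))
            if all(s[i].upper() in valid for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Toy alignment (made-up data): column 6 holds an N, column 7 holds a gap.
toy = ["ATGGCC-TA", "ATGACCTTA", "ATGGCNTTA"]
clean = complete_deletion(toy)
distance = p_distance(clean[0], clean[1])  # one difference over seven sites
```

The pairwise distances obtained this way would then feed a neighbor-joining algorithm to produce the working topology.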

M7 allows codons to evolve only neutrally or under 
purifying selection, while M8 adds a class of sites under 
positive selection. These two nested models were 
compared using a likelihood ratio test (LRT) with 2 degrees 
of freedom [14]. Amino acids under selection for M8 were 
identified using the Bayes Empirical Bayes (BEB) approach 
with posterior probability >90%. 
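
The likelihood ratio test described above can be sketched in a few lines of Python. The function name is hypothetical; the statistic is twice the difference in log-likelihood between M8 and M7, and for 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is assumed. The example values are the IFNGR1 log-likelihoods reported in Table 1.

```python
import math


def lrt_m7_m8(lnl_m7, lnl_m8):
    """Return (2*delta lnL, p-value) for the M7 vs M8 comparison (df = 2)."""
    stat = max(0.0, 2.0 * (lnl_m8 - lnl_m7))
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# IFNGR1 log-likelihoods from Table 1: M7 = -3416.13, M8 = -3412.95.
stat, p_value = lrt_m7_m8(-3416.13, -3412.95)
print(f"2*delta lnL = {stat:.2f}, p = {p_value:.3f}")
```

With these inputs the statistic reproduces the value of 6.36 listed for IFNGR1 in Table 1.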

It is more difficult to identify specific sites under selection 
than to show that a proportion of the sites are under selection. 
Therefore, only sites with a posterior probability >90% were 
considered as candidates. Next, a series of ML methods was 
applied to the data set in order to test for positive selection in 
individual codons of the gene sequences: the HyPhy package 
implemented in the Datamonkey web server [15] and the 
SLR method [16]. 



In the Datamonkey web server, the best-fitting nucleotide 
substitution model was identified using the automatic 
model selection tool available on the server. 

All sequences of each gene were analyzed under three 
distinct models: single likelihood ancestor counting (SLAC), 
fixed-effect likelihood (FEL) and random effect likelihood 
(REL). The SLAC model is based on the reconstruction of 
the ancestral sequences and the counts of dN and dS at each 
codon position of the phylogeny. The FEL model estimates 
the ratio dN/dS on a site-by-site basis, without assuming an 
a priori distribution of rates across sites. The REL model first 
fits a distribution of rates across sites and then infers the 
substitution rate for individual sites. Sites with P values <0.1 
for SLAC and FEL, and Bayes factor >50 for REL were 
considered as candidates to be under positive selection. 
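
The cutoffs above can be expressed as a simple filter. The sketch below is only an illustration: the record layout, the function name and the toy numbers are hypothetical and do not reflect the Datamonkey output format, and the requirement that dN exceed dS before a site counts as positively selected is an assumption of this sketch.

```python
def is_candidate(method, dn, ds, score, p_cutoff=0.1, bf_cutoff=50.0):
    """Apply the per-method cutoff; score is a P value (SLAC, FEL) or Bayes factor (REL)."""
    if dn <= ds:
        return False  # assumed here: only sites with dN > dS can be positively selected
    if method in ("SLAC", "FEL"):
        return score < p_cutoff
    if method == "REL":
        return score > bf_cutoff
    raise ValueError(f"unknown method: {method}")


# Toy records (made-up values): (method, codon, dN, dS, score).
toy_rows = [
    ("SLAC", 10, 2.0, 0.5, 0.04),
    ("FEL", 10, 1.8, 0.4, 0.08),
    ("FEL", 20, 0.2, 1.1, 0.01),   # low P value but dN < dS, so not a candidate
    ("REL", 10, 1.5, 0.3, 120.0),
    ("REL", 30, 1.2, 0.6, 12.0),   # Bayes factor below the cutoff
]

hits = {}
for method, codon, dn, ds, score in toy_rows:
    if is_candidate(method, dn, ds, score):
        hits.setdefault(method, set()).add(codon)
```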

The SLR method is complementary to the random sites 
model implemented in PAML. It performs an explicit 
likelihood-ratio test for selection at each site in the 
alignment, making few assumptions about the distribution of 
selection and allowing every site to be under a different level 
of evolutionary constraint. Further, a series of Python scripts 
was written in order to identify the nucleotide changes at 
positively selected codons for each gene (data not shown). 
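
The scripts themselves are not shown; the following is a minimal sketch of how such a lookup could work, under the assumption that each gene is available as an in-frame codon alignment keyed by species. All names and the toy alignment are hypothetical.

```python
def codon_at(seq, codon_number):
    """Return the codon at a 1-based codon position of an in-frame sequence."""
    start = 3 * (codon_number - 1)
    codon = seq[start:start + 3]
    if len(codon) != 3:
        raise ValueError(f"codon {codon_number} is outside the sequence")
    return codon


def changes_at_sites(alignment, sites):
    """Map each codon position to the codon carried by every species."""
    return {site: {sp: codon_at(seq, site) for sp, seq in alignment.items()}
            for site in sites}


def variable_positions(codons):
    """1-based positions within the codon that differ among the given codons."""
    return [i + 1 for i in range(3) if len({c[i] for c in codons}) > 1]


# Toy in-frame alignment (made-up sequences), inspected at codon 2.
toy_alignment = {"human": "ATGGCCAAA", "chimp": "ATGGCTAAA", "macaque": "ATGTCCAAA"}
table = changes_at_sites(toy_alignment, [2])
varying = variable_positions(table[2].values())
```

Listing the codons side by side in this way makes it straightforward to see which nucleotide positions account for the amino acid change at a candidate site.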

3. Results and Discussion 

Genes of the immune system and genes involved in 
host-pathogen interactions have been shown to be highly 
prone to adaptive selection [17,18]. Despite several studies 
on the evolution of these genes in human and non-human 
primates, a clear picture of the evolution of these gene 
families has not emerged, probably because previous studies 
have considered samples within species or between species, 
but not both. Genes associated with the immune system are 
under constant evolutionary pressure to change as a result of 
host-parasite co-evolution, where advantageous mutations 
are heavily favored. 

Using ML approaches, evidence of positive or negative 
selection was detected in all genes studied (Table 1). 

Nested models with or without positive selection were 
compared using LRTs. As a result, for four of the eight genes 
(IFNGR1, TIRAP, IL12RB1 and CD209), a model that 
includes sites with dN/dS >1 fits the data significantly better 
than a neutral model (Table 1). For each of these genes, 
the proportion of sites under selection according to the M8 
model in PAML was relatively low. The specific codons 
identified by the BEB approach with a posterior probability 
of 90% represent an even smaller fraction of that proportion. 

The other ML methods also detected sites under selection, 
some of which coincide with the codons previously 
identified by M8. To identify robust candidates for sites 
under selection, the analysis considered only sites for which 
the evidence of selection was concordant among 
methods (Table 1). IFNGR1 presents the highest number of 
positively selected codons (13), whereas the VDR, P2RX7, 
TNFRSF1B and MC3R genes showed evidence of being or having 
been under a regime of negative selection. 
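
The concordance step can be illustrated with a short Python sketch that counts, for every codon, how many methods flagged it. The minimum number of agreeing methods is not stated in the text, so `min_methods` is a hypothetical parameter, and the site lists below are made-up values rather than the entries of Table 1.

```python
from collections import Counter


def concordant_sites(sites_by_method, min_methods=2):
    """Return codons flagged by at least min_methods of the methods."""
    counts = Counter()
    for sites in sites_by_method.values():
        counts.update(set(sites))  # count each method at most once per codon
    return sorted(site for site, n in counts.items() if n >= min_methods)


# Toy per-method site lists (made-up values).
toy_sites = {
    "M8": [12, 40, 77],
    "SLR": [40, 77, 90],
    "SLAC": [],
    "FEL": [40, 101],
    "REL": [40],
}

robust = concordant_sites(toy_sites)
strict = concordant_sites(toy_sites, min_methods=3)
```

Raising `min_methods` trades sensitivity for robustness, since fewer codons are supported by several methods at once.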
To examine the phylogenetic distribution of the inferred 
positively selected changes among primate clades 
(Prosimians, New World monkeys, Old World monkeys, and 
apes), the current study investigated the dN/dS ratio (ω) among 
lineages for each branch of the gene phylogenetic trees. The 
one-ratio model, which assumes an equal ω ratio for all 
branches in the phylogeny, was then compared with the 
free-ratio model, which assumes an independent ω ratio for 
each branch. The difference between the two models was 
significant (0.01 < P < 0.05), indicating that there is variable 
selective pressure in the phylogeny. Interestingly, the 
branches with dN/dS >1 were found among Old World 
monkeys and apes (data not shown). 
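
The comparison of the one-ratio and free-ratio models can be sketched as a likelihood ratio test, assuming an unrooted, fully bifurcating tree. Under that assumption a tree of n species has 2n - 3 branches, so the free-ratio model has 2n - 4 more dN/dS parameters than the one-ratio model. The function names and the log-likelihood values are hypothetical; because the degrees of freedom are always even here, the chi-square tail probability has a closed form and no statistics library is assumed.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function for a positive, even number of degrees of freedom."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # (x/2)^k / k!
        total += term
    return math.exp(-half) * total


def branch_lrt(lnl_one_ratio, lnl_free_ratio, n_species):
    """Return (statistic, df, p-value) for one-ratio vs free-ratio models."""
    if n_species < 3:
        raise ValueError("at least three species are required")
    df = 2 * n_species - 4        # (2n - 3) branch ratios minus one shared ratio
    stat = max(0.0, 2.0 * (lnl_free_ratio - lnl_one_ratio))
    return stat, df, chi2_sf_even_df(stat, df)


# Made-up log-likelihoods for a six-species gene tree.
stat, df, p_value = branch_lrt(-2000.0, -1990.0, 6)
```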

When three-dimensional structures were available, it was 
possible to look at the functional significance and location of 
all positively selected sites in those genes identified by the ML 
methods. For two genes (TIRAP and IFNGR1) [19, 20], the 
analyses could detect several sites that fall in or immediately 
adjacent to regions or residues suggested to affect function, 
such as sites on the dimerization surface. 



Table 1. Phylogenetic test of positive selection 

Gene      No. species  lnL M7 (neutral)  lnL M8 (selection)  2ΔlnL  Significance  ps; ωs        Tot. sites
VDR       7            -2236.72          -2236.6             0.24   NS
P2RX7     6            -3179.55          -3179.56            0.02   NS
TNFRSF1B  6            -2754.38          -2753.05            2.66   NS
IFNGR1    6            -3416.13          -3412.95            6.36   **            0.007; 14.81  13
MC3R      4            -1317.41          -1317.01            0.8    NS
TIRAP     7            -1229.6           -1226.34            6.52   **            0.04; 1.74    2
IL12RB1   6            -2789.97          -2779.083           21.78  **            0.16; 3.54    1
CD209     8            -1902.79          -1898.85            7.88   **            0.03; 4.47    1

Codon positions listed in the PAML M8, SLR, SLAC, FEL and REL columns, in the order and grouping in which they appear (groups separated by semicolons): 

VDR: 194, 201, 204, 209, 215, 218; 222 
P2RX7: 86, 348, 367, 441 
TNFRSF1B and IFNGR1 (cells interleaved): 2, 9, 23, 27, 36; 527, 491, 13, 31, 52, 72, 80, 118; 39, 41, 48, 65; 133, 135, 143; 82, 87, 90, 91; 146, 156; 31, 52, 56; 107, 112, 135; 172; 52, 65, 72, 79; 118, 143; 145, 148, 157; 204, 246; 96, 113, 124; 171, 181; 161, 179, 197; 259; 181, 246, 268; 204, 223; 204, 212, 215; 266, 315; 233, 323, 368; 246, 268; 217, 246, 265; 376; 373, 394; 323, 368; 271, 314, 340; 392, 394; 373, 394; 353, 360, 372; 406; 381, 407, 424; 425, 435, 464, 486 
MC3R and TIRAP (cells interleaved): 350, 79, 44, 51, 53; 81; 22, 44, 81; 54, 81, 116, 127, 95, 102, 104, 108, 109, 113, 116, 118; 81 
IL12RB1: 3, 34, 35, 38, 39, 46, 262, 266, 27, 279, 302; 125, 129, 131, 133, 165, 169, 177, 179, 369, 403, 410, 424, 435; 24, 35, 39, 40, 45, 91, 99, 143, 144; 58, 165 
CD209: 69, 176, 207; 158, 285, 312, 350, 367, 378, 423; 53, 158, 211; 158 


To evaluate the level of selective constraint among genes, 
the global dN/dS was estimated for each gene over the 
primate phylogeny as well as for the human lineage (Table 
2). Genes inferred as positively selected from the substitution 
pattern (IFNGR1, CD209, TIRAP, IL12RB1) evolve, on 
average, faster than those under negative selection (t test 
P=0.007). The domain-specific dN/dS values further show 
that the extracellular and transmembrane domains evolve 
faster than the cytoplasmic domain in most of the genes. 



The present work describes levels of variation among 
primates and humans for each receptor gene discussed above 
with the aim to provide a general picture of their evolution 
over different timescales. 

Signatures of positive selection were detected in the rates of 
substitution across primates in four of the eight genes analyzed. 
Several ML methods identified specific codons with high 
probability of being under selection. Undoubtedly, the 
identification of sites subjected to positive selection 
provides insights into the function of a molecule. 
The comparison of human-primate gene family repertoires is 
an important step in order to clarify and identify selective 
pressure driving adaptive evolution. 

Future investigations, such as studies of the functional 
consequences of possible polymorphisms, are needed to establish 
the functional relevance of these receptor genes. 



4. Conclusions 

The identification of genes and gene regions subjected to 
positive selection can lead to predictions regarding the 
putative functionally important regions of genes. African 
primates remain an unexplored source of information 
needed to complete the picture of the origin of many human 
pathogens. The close relationship between humans and 
chimpanzees is an important link. Further research to 
establish primate-specific selective events may be 
fundamental to understanding the evolutionary 
history giving rise to humans. 

Additionally, crystallographic studies would be helpful for 
assessing the functional relevance of the positively selected 
codons detected. 



Table 2. Selective pressure (dN/dS) 

Gene      Human branch  Global  SP    EXT   TM    CYT
P2RX7     0.14          0.14    0.12  0.60  0.27  n/a
TNFRSF1B  0.09          0.28    0.40  0.10  n/a   0.5
MC3R      0.17          0.05    0.10  0.32  0.41  0.10
IL12RB1   0.28          0.38    n/a   0.40  0.50  0.04
CD209     0.51          0.44    n/a   0.31  n/a   0.16
TIRAP     0.49          0.11    0.25  0.42  0.11  0.12
IFNGR1    0.25          0.60    0.50  0.30  0.32  0.50
VDR       0.35          0.07    0.04  0.06  n/a   n/a

SP = signal peptide; EXT = extracellular domain; TM = transmembrane 
domain; CYT = cytoplasmic domain 